278 research outputs found

    Effect of forename string on author name disambiguation

    Full text link
    In author name disambiguation, author forenames are used to decide which name instances are disambiguated together and how much they are likely to refer to the same author. Despite such a crucial role of forenames, their effect on the performance of heuristic (string matching) and algorithmic disambiguation is not well understood. This study assesses the contributions of forenames in author name disambiguation using multiple labeled data sets under varying ratios and lengths of full forenames, reflecting realā€world scenarios in which an author is represented by forename variants (synonym) and some authors share the same forenames (homonym). The results show that increasing the ratios of full forenames substantially improves both heuristic and machineā€learningā€based disambiguation. Performance gains by algorithmic disambiguation are pronounced when many forenames are initialized or homonyms are prevalent. As the ratios of full forenames increase, however, they become marginal compared to those by string matching. Using a small portion of forename strings does not reduce much the performances of both heuristic and algorithmic disambiguation methods compared to using fullā€length strings. These findings provide practical suggestions, such as restoring initialized forenames into a fullā€string format via record linkage for improved disambiguation performances.Peer Reviewedhttps://deepblue.lib.umich.edu/bitstream/2027.42/155924/1/asi24298.pdfhttps://deepblue.lib.umich.edu/bitstream/2027.42/155924/2/asi24298_am.pd

    Scaleā€free collaboration networks: An author name disambiguation perspective

    Full text link
    Peer Reviewedhttps://deepblue.lib.umich.edu/bitstream/2027.42/149559/1/asi24158.pdfhttps://deepblue.lib.umich.edu/bitstream/2027.42/149559/2/asi24158_am.pd

    A Syllable-based Technique for Word Embeddings of Korean Words

    Full text link
    Word embedding has become a fundamental component to many NLP tasks such as named entity recognition and machine translation. However, popular models that learn such embeddings are unaware of the morphology of words, so it is not directly applicable to highly agglutinative languages such as Korean. We propose a syllable-based learning model for Korean using a convolutional neural network, in which word representation is composed of trained syllable vectors. Our model successfully produces morphologically meaningful representation of Korean words compared to the original Skip-gram embeddings. The results also show that it is quite robust to the Out-of-Vocabulary problem.Comment: 5 pages, 3 figures, 1 table. Accepted for EMNLP 2017 Workshop - The 1st Workshop on Subword and Character level models in NLP (SCLeM

    Hybrid Deep Learning Architecture to Forecast Maximum Load Duration Using Time-of-Use Pricing Plans

    Get PDF
    Load forecasting has received crucial research attention to reduce peak load and contribute to the stability of power grid using machine learning or deep learning models. Especially, we need the adequate model to forecast the maximum load duration based on time-of-use, which is the electricity usage fare policy in order to achieve the goals such as peak load reduction in a power grid. However, the existing single machine learning or deep learning forecasting cannot easily avoid overfitting. Moreover, a majority of the ensemble or hybrid models do not achieve optimal results for forecasting the maximum load duration based on time-of-use. To overcome these limitations, we propose a hybrid deep learning architecture to forecast maximum load duration based on time-of-use. Experimental results indicate that this architecture could achieve the highest average of recall and accuracy (83.43%) compared to benchmarkmodels. To verify the effectiveness of the architecture, another experimental result shows that energy storage system (ESS) scheme in accordance with the forecast results of the proposed model (LSTM-MATO) in the architecture could provide peak load cost savings of 17,535,700KRWeach year comparing with original peak load costs without the method. Therefore, the proposed architecture could be utilized for practical applications such as peak load reduction in the grid
    • ā€¦
    corecore